Estimating Continuous Distributions in Bayesian Classifiers
Abstract
When modeling a probability distribution with a Bayesian network, we are faced with the problem of how to handle continuous variables. Most previous work has either solved the problem by discretizing, or assumed that the data are generated by a single Gaussian. In this paper we abandon the normality assumption and instead use statistical methods for nonparametric density estimation. For a naive Bayesian classifier, we present experimental results on a variety of natural and artificial domains, comparing two methods of density estimation: assuming normality and modeling each conditional distribution with a single Gaussian; and using nonparametric kernel density estimation. We observe large reductions in error on several natural and artificial data sets, which suggests that kernel estimation is a useful tool for learning Bayesian models.
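The two density estimates compared in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy data, and the bandwidth rule `h = 1 / sqrt(n)` are all assumptions made for illustration. It shows a naive Bayesian classifier whose per-class, per-feature density is either a single fitted Gaussian or an average of Gaussian kernels centered on the training values.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def single_gaussian_density(x, samples):
    """Parametric estimate: one Gaussian fitted to a class's training values."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((s - mu) ** 2 for s in samples) / max(n - 1, 1)
    sigma = max(math.sqrt(var), 1e-6)  # guard against zero variance
    return gaussian_pdf(x, mu, sigma)


def kernel_density(x, samples):
    """Nonparametric estimate: average of Gaussian kernels, one per training value."""
    n = len(samples)
    h = 1.0 / math.sqrt(n)  # assumed bandwidth rule, chosen for illustration
    return sum(gaussian_pdf(x, s, h) for s in samples) / n


def classify(x, data_by_class, density):
    """Naive Bayes: pick the class maximizing log prior + sum of log feature densities."""
    total = sum(len(rows) for rows in data_by_class.values())
    best_label, best_score = None, -math.inf
    for label, rows in data_by_class.items():
        score = math.log(len(rows) / total)
        for j, xj in enumerate(x):
            column = [row[j] for row in rows]
            score += math.log(density(xj, column) + 1e-300)  # avoid log(0)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical one-feature data in which both classes are bimodal.
data = {
    "a": [(-3.2,), (-3.0,), (-2.8,), (2.8,), (3.0,), (3.2,)],
    "b": [(-0.2,), (0.0,), (0.2,), (11.8,), (12.0,), (12.2,)],
}

query = (0.0,)
print("single Gaussian:", classify(query, data, single_gaussian_density))
print("kernel estimate:", classify(query, data, kernel_density))
```

On this toy data the query point lies inside one of class "b"'s modes. The single-Gaussian model nevertheless assigns it to class "a", because a's fitted Gaussian is centered at 0 while b's is centered far away at 6, whereas the kernel estimate assigns it to "b". This is only meant to show the kind of non-Gaussian case where the two estimates disagree, not to reproduce the paper's experiments.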
Similar resources
A Brief Note on Maximum Realisable MCMC Classifiers
We present a novel and powerful strategy for estimating and combining classifiers via ROC curves, decision analysis theory and MCMC simulation. This paradigm also allows us to select samples from an MCMC run in a parsimonious and optimal fashion. Each ROC curve corresponds to a sample (classifier) obtained with a full Bayesian model, which treats the model dimension, model parameters, regularisa...
First order Gaussian graphs for efficient structure classification
First order random graphs as introduced by Wong are a promising tool for structure-based classification. Their complexity, however, hampers their practical application. We describe an extension to first order random graphs which uses continuous Gaussian distributions to model the densities of all random elements in a random graph. These First Order Gaussian Graphs (FOGGs) are shown to have severa...
Comparing Gaussian and Polynomial Classification in SCHMM-Based Recognition Systems
Semi-continuous Hidden Markov Models (SCHMM) with Gaussian distributions are often used in continuous speech or handwriting recognition systems. Our paper compares Gaussian and tree-structured polynomial classifiers, which have been successfully used in pattern recognition for many years. In our system the binary classifier tree is generated by clustering HMM states using an entropy measure. For...
A Simple Approach to Building Ensembles of Naive Bayesian Classifiers for Word Sense Disambiguation
This paper presents a corpus-based approach to word sense disambiguation that builds an ensemble of Naive Bayesian classifiers, each of which is based on lexical features that represent co-occurring words in varying sized windows of context. Despite the simplicity of this approach, empirical results disambiguating the widely studied nouns line and interest show that such an ensemble achieves acc...